Goto

Collaborating Authors

 correlation feature


Inferring the Graph Structure of Images for Graph Neural Networks

Gowda, Mayur S, Shi, John, Santos, Augusto, Moura, José M. F.

arXiv.org Artificial Intelligence

Image datasets such as MNIST are a key benchmark for testing Graph Neural Network (GNN) architectures. The images are traditionally represented as a grid graph with each node representing a pixel and edges connecting neighboring pixels (vertically and horizontally). The graph signal is the values (intensities) of each pixel in the image. The graphs are commonly used as input to graph neural networks (e.g., Graph Convolutional Neural Networks (Graph CNNs) [1, 2], Graph Attention Networks (GAT) [3], GatedGCN [4]) to classify the images. In this work, we improve the accuracy of downstream graph neural network tasks by finding alternative graphs to the grid graph and superpixel methods to represent the dataset images, following the approach in [5, 6]. We find row correlation, column correlation, and product graphs for each image in MNIST and Fashion-MNIST using correlations between the pixel values building on the method in [5, 6]. Experiments show that using these different graph representations and features as input into downstream GNN models improves the accuracy over using the traditional grid graph and superpixel methods in the literature.


AerOSeg: Harnessing SAM for Open-Vocabulary Segmentation in Remote Sensing Images

Dutta, Saikat, Vasim, Akhil, Gole, Siddhant, Rezatofighi, Hamid, Banerjee, Biplab

arXiv.org Artificial Intelligence

Image segmentation beyond predefined categories is a key challenge in remote sensing, where novel and unseen classes often emerge during inference. Open-vocabulary image Segmentation addresses these generalization issues in traditional supervised segmentation models while reducing reliance on extensive per-pixel annotations, which are both expensive and labor-intensive to obtain. Most Open-Vocabulary Segmentation (OVS) methods are designed for natural images but struggle with remote sensing data due to scale variations, orientation changes, and complex scene compositions. This necessitates the development of OVS approaches specifically tailored for remote sensing. In this context, we propose AerOSeg, a novel OVS approach for remote sensing data. First, we compute robust image-text correlation features using multiple rotated versions of the input image and domain-specific prompts. These features are then refined through spatial and class refinement blocks. Inspired by the success of the Segment Anything Model (SAM) in diverse domains, we leverage SAM features to guide the spatial refinement of correlation features. Additionally, we introduce a semantic back-projection module and loss to ensure the seamless propagation of SAM's semantic information throughout the segmentation pipeline. Finally, we enhance the refined correlation features using a multi-scale attention-aware decoder to produce the final segmentation map. We validate our SAM-guided Open-Vocabulary Remote Sensing Segmentation model on three benchmark remote sensing datasets: iSAID, DLRSD, and OpenEarthMap. Our model outperforms state-of-the-art open-vocabulary segmentation methods, achieving an average improvement of 2.54 h-mIoU.


TCGPN: Temporal-Correlation Graph Pre-trained Network for Stock Forecasting

Yan, Wenbo, Tan, Ying

arXiv.org Machine Learning

Recently, the incorporation of both temporal features and the correlation across time series has become an effective approach in time series prediction. Spatio-Temporal Graph Neural Networks (STGNNs) demonstrate good performance on many Temporal-correlation Forecasting Problem. However, when applied to tasks lacking periodicity, such as stock data prediction, the effectiveness and robustness of STGNNs are found to be unsatisfactory. And STGNNs are limited by memory savings so that cannot handle problems with a large number of nodes. In this paper, we propose a novel approach called the Temporal-Correlation Graph Pre-trained Network (TCGPN) to address these limitations. TCGPN utilize Temporal-correlation fusion encoder to get a mixed representation and pre-training method with carefully designed temporal and correlation pre-training tasks. Entire structure is independent of the number and order of nodes, so better results can be obtained through various data enhancements. And memory consumption during training can be significantly reduced through multiple sampling. Experiments are conducted on real stock market data sets CSI300 and CSI500 that exhibit minimal periodicity. We fine-tune a simple MLP in downstream tasks and achieve state-of-the-art results, validating the capability to capture more robust temporal correlation patterns.


Identification of 4FGL uncertain sources at Higher Resolutions with Inverse Discrete Wavelet Transform

Cao, Haitao, Xiao, Hubing, Luo, Zhijian, Zeng, Xiangtao, Fan, Junhui

arXiv.org Artificial Intelligence

In the forthcoming era of big astronomical data, it is a burden to find out target sources from ground-based and space-based telescopes. Although Machine Learning (ML) methods have been extensively utilized to address this issue, the incorporation of in-depth data analysis can significantly enhance the efficiency of identifying target sources when dealing with massive volumes of astronomical data. In this work, we focused on the task of finding AGN candidates and identifying BL Lac/FSRQ candidates from the 4FGL DR3 uncertain sources. We studied the correlations among the attributes of the 4FGL DR3 catalogue and proposed a novel method, named FDIDWT, to transform the original data. The transformed dataset is characterized as low-dimensional and feature-highlighted, with the estimation of correlation features by Fractal Dimension (FD) theory and the multi-resolution analysis by Inverse Discrete Wavelet Transform (IDWT). Combining the FDIDWT method with an improved lightweight MatchboxConv1D model, we accomplished two missions: (1) to distinguish the Active Galactic Nuclei (AGNs) from others (Non-AGNs) in the 4FGL DR3 uncertain sources with an accuracy of 96.65%, namely, Mission A; (2) to classify blazar candidates of uncertain type (BCUs) into BL Lacertae objects (BL Lacs) or Flat Spectrum Radio Quasars (FSRQs) with an accuracy of 92.03%, namely, Mission B. There are 1354 AGN candidates in Mission A, 482 BL Lacs candidates and 128 FSRQ candidates in Mission B were found. The results show a high consistency of greater than 98% with the results in previous works. In addition, our method has the advantage of finding less variable and relatively faint sources than ordinary methods.


IoT Data Trust Evaluation via Machine Learning

Tadj, Timothy, Arablouei, Reza, Dedeoglu, Volkan

arXiv.org Artificial Intelligence

Various approaches based on supervised or unsupervised machine learning (ML) have been proposed for evaluating IoT data trust. However, assessing their real-world efficacy is hard mainly due to the lack of related publicly-available datasets that can be used for benchmarking. Since obtaining such datasets is challenging, we propose a data synthesis method, called random walk infilling (RWI), to augment IoT time-series datasets by synthesizing untrustworthy data from existing trustworthy data. Thus, RWI enables us to create labeled datasets that can be used to develop and validate ML models for IoT data trust evaluation. We also extract new features from IoT time-series sensor data that effectively capture its auto-correlation as well as its cross-correlation with the data of the neighboring (peer) sensors. These features can be used to learn ML models for recognizing the trustworthiness of IoT sensor data. Equipped with our synthesized ground-truth-labeled datasets and informative correlation-based feature, we conduct extensive experiments to critically examine various approaches to evaluating IoT data trust via ML. The results reveal that commonly used ML-based approaches to IoT data trust evaluation, which rely on unsupervised cluster analysis to assign trust labels to unlabeled data, perform poorly. This poor performance can be attributed to the underlying unsubstantiated assumption that clustering provides reliable labels for data trust, a premise that is found to be untenable. The results also show that the ML models learned from datasets augmented via RWI while using the proposed features generalize well to unseen data and outperform existing related approaches. Moreover, we observe that a semi-supervised ML approach that requires only about 10% of the data labeled offers competitive performance while being practically more appealing compared to the fully-supervised approaches.


FaFCNN: A General Disease Classification Framework Based on Feature Fusion Neural Networks

Kong, Menglin, Zhao, Shaojie, Cheng, Juan, Li, Xingquan, Su, Ri, Hou, Muzhou, Cao, Cong

arXiv.org Artificial Intelligence

There are two fundamental problems in applying deep learning/machine learning methods to disease classification tasks, one is the insufficient number and poor quality of training samples; another one is how to effectively fuse multiple source features and thus train robust classification models. To address these problems, inspired by the process of human learning knowledge, we propose the Feature-aware Fusion Correlation Neural Network (FaFCNN), which introduces a feature-aware interaction module and a feature alignment module based on domain adversarial learning. This is a general framework for disease classification, and FaFCNN improves the way existing methods obtain sample correlation features. The experimental results show that training using augmented features obtained by pre-training gradient boosting decision tree yields more performance gains than random-forest based methods. On the low-quality dataset with a large amount of missing data in our setup, FaFCNN obtains a consistently optimal performance compared to competitive baselines. In addition, extensive experiments demonstrate the robustness of the proposed method and the effectiveness of each component of the model\footnote{Accepted in IEEE SMC2023}.


Aligning Correlation Information for Domain Adaptation in Action Recognition

Xu, Yuecong, Yang, Jianfei, Cao, Haozhi, Mao, Kezhi, Yin, Jianxiong, See, Simon

arXiv.org Artificial Intelligence

Domain adaptation (DA) approaches address domain shift and enable networks to be applied to different scenarios. Although various image DA approaches have been proposed in recent years, there is limited research towards video DA. This is partly due to the complexity in adapting the different modalities of features in videos, which includes the correlation features extracted as long-term dependencies of pixels across spatiotemporal dimensions. The correlation features are highly associated with action classes and proven their effectiveness in accurate video feature extraction through the supervised action recognition task. Yet correlation features of the same action would differ across domains due to domain shift. Therefore we propose a novel Adversarial Correlation Adaptation Network (ACAN) to align action videos by aligning pixel correlations. ACAN aims to minimize the distribution of correlation information, termed as Pixel Correlation Discrepancy (PCD). Additionally, video DA research is also limited by the lack of cross-domain video datasets with larger domain shifts. We, therefore, introduce a novel HMDB-ARID dataset with a larger domain shift caused by a larger statistical difference between domains. This dataset is built in an effort to leverage current datasets for dark video classification. Empirical results demonstrate the state-of-the-art performance of our proposed ACAN for both existing and the new video DA datasets.


Multi-Level Deep Cascade Trees for Conversion Rate Prediction

Wen, Hong, Zhang, Jing, Lin, Quan, Yang, Keping, Jin, Taiwei, Lv, Fuyu, Pan, Xiaofeng, Huang, Pipei, Zha, Zheng-Jun

arXiv.org Machine Learning

Developing effective and efficient recommendation methods is very challenging for modern e-commerce platforms (e.g. Taobao). Generally speaking, two essential modules named "Click-Through Rate Prediction" (CTR) and "Conversion Rate Prediction" (CVR) are included, where CVR module is a crucial factor that affects the final purchasing volume directly. However, it is indeed very challenging due to its sparseness nature. In this paper, we tackle this problem by proposing multi-Level Deep Cascade Trees (ldcTree), which is a novel decision tree ensemble approach. It leverages deep cascade structures by stacking Gradient Boosting Decision Trees (GBDT) to effectively learn feature representation. In addition, we propose to utilize the cross-entropy in each tree of the preceding GBDT as the input feature representation for next level GBDT, which has a clear explanation, i.e., a traversal from root to leaf nodes in the next level GBDT corresponds to the combination of certain traversals in the preceding GBDT. The deep cascade structure and the combination rule enable the proposed ldcTree to have a stronger distributed feature representation ability. Moreover, inspired by ensemble learning, we propose an Ensemble ldcTree (E-ldcTree) to encourage the model's diversity and enhance the representation ability further. Finally, we propose an improved Feature learning method based on EldcTree (F-EldcTree) for taking adequate use of weak and strong correlation features identified by pre-trained GBDT models. Experimental results on off-line dataset and online deployment demonstrate the effectiveness of the proposed methods.


Personalized Ranking Metric Embedding for Next New POI Recommendation

Feng, Shanshan (Nanyang Technological University) | Li, Xutao (Nanyang Technological University) | Zeng, Yifeng (Teesside University) | Cong, Gao (Nanyang Technological University) | Chee, Yeow Meng (Nanyang Technological University) | Yuan, Quan (Nanyang Technological University)

AAAI Conferences

The rapidly growing of Location-based Social Networks (LBSNs) provides a vast amount of check-in data, which enables many services, e.g., point-of-interest (POI) recommendation. In this paper, we study the next new POI recommendation problem in which new POIs with respect to users' current location are to be recommended. The challenge lies in the difficulty in precisely learning users' sequential information and personalizing the recommendation model. To this end, we resort to the Metric Embedding method for the recommendation, which avoids drawbacks of the Matrix Factorization technique. We propose a personalized ranking metric embedding method (PRME) to model personalized check-in sequences. We further develop a PRME-G model, which integrates sequential information, individual preference, and geographical influence, to improve the recommendation performance. Experiments on two real-world LBSN datasets demonstrate that our new algorithm outperforms the state-of-the-art next POI recommendation methods.


Personalized Ranking Metric Embedding for Next New POI Recommendation

Feng, Shanshan (Nanyang Technological University) | Li, Xutao (Nanyang Technological University) | Zeng, Yifeng (Teesside University) | Cong, Gao (Nanyang Technological University) | Chee, Yeow Meng (Nanyang Technological University) | Yuan, Quan (Nanyang Technological University)

AAAI Conferences

The rapidly growing of Location-based Social Networks (LBSNs) provides a vast amount of check-in data, which enables many services, e.g., point-of-interest (POI) recommendation. In this paper, we study the next new POI recommendation problem in which new POIs with respect to users' current location are to be recommended. The challenge lies in the difficulty in precisely learning users' sequential information and personalizing the recommendation model. To this end, we resort to the Metric Embedding method for the recommendation, which avoids drawbacks of the Matrix Factorization technique. We propose a personalized ranking metric embedding method (PRME) to model personalized check-in sequences. We further develop a PRME-G model, which integrates sequential information, individual preference, and geographical influence, to improve the recommendation performance. Experiments on two real-world LBSN datasets demonstrate that our new algorithm outperforms the state-of-the-art next POI recommendation methods.